Computational Complexity of Probabilistic Disambiguation

نویسنده

  • Khalil Sima'an
چکیده

Recent models of natural language processing employ statistical reasoning for dealing with the ambiguity of formal grammars. In this approach, statistics, concerning the various linguistic phenomena of interest, are gathered from actual linguistic data and used to estimate the probabilities of the various entities that are generated by a given grammar, e.g., derivations, parse-trees and sentences. The extension of grammars with probabilities makes it possible to state ambiguity resolution as a constrained optimization formula, which aims at maximizing the probability of some entity that the grammar generates given the input (e.g., maximum probability parse-tree given some input sentence). The implementation of these optimization formulae in efficient algorithms, however, does not always proceed smoothly. In this paper, we address the computational complexity of ambiguity resolution under various kinds of probabilistic models. We provide proofs that some, frequently occurring problems of ambiguity resolution are NP-complete. These problems are encountered in various applications, e.g., language understanding for textand speech-based applications. Assuming the common model of computation, this result implies that, for many existing probabilistic models it is not possible to devise tractable algorithms for solving these optimization problems. JEL codes: D24, L60, 047

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

In Proceedings of the 18 th International Conference on Computational Linguistics , 2000 , Saarbrücken . Using a Probabilistic Class - Based Lexiconfor

This paper presents the use of probabilistic class-based lexica for disambiguation in target-word selection. Our method employs minimal but precise contextual information for disam-biguation. That is, only information provided by the target-verb, enriched by the condensed information of a probabilistic class-based lexicon , is used. Induction of classes and ne-tuning to verbal arguments is done...

متن کامل

A Novel Disambiguation Method for Unification-Based Grammars Using Probabilistic Context-Free Approximations

We present a novel disambiguation method for unification-based grammars (UBGs). In contrast to other methods, our approach obviates the need for probability models on the UBG side in that it shifts the responsibility to simpler context-free models, indirectly obtained from the UBG. Our approach has three advantages: (i) training can be effectively done in practice, (ii) parsing and disambiguati...

متن کامل

Dependency Of Context-Based Word Sense Disambiguation From Representation And Domain Complexity

Word Sense Disambiguation (WSD) is a central task in the area of Natural Language Processing. In the past few years several context-based probabilistic and machine learning methods for WSD have been presented in literature. However, an important area of research that has not been given the attention it deserves is a formal analysis of the parameters affecting the performance of the learning tas...

متن کامل

A New Supervised Learning Algorithm for Word Sense Disambiguation

The Naive Mix is a new supervised learning algorithm that is based on a sequential method for selecting probabilistic models. The usual objective of model selection is to nd a single model that adequately characterizes the data in a training sample. However, during model selection a sequence of models is generated that consists of the best{{tting model at each level of model complexity. The Nai...

متن کامل

A Theoretical Analysis of Context-based Learning Algorithms or Word Sense Disambiguation

1 This paper has been published on the Proceedings of ECAI 2000 Abstract. Word Sense Disambiguation (WSD) is a central task in the area of Natural Language Processing. In the past few years several context-based probabilistic and machine learning methods for WSD have been presented in literature. However, an important area of research that has not been given the attention it deserves is a forma...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Grammars

دوره 5  شماره 

صفحات  -

تاریخ انتشار 2002